An Improved Automatic Term Recognition Method for Spanish
نویسندگان
چکیده
The C-value/NC-value algorithm, a hybrid approach to automatic term recognition, has been originally developed to extract multiword term candidates from specialised documents written in English. Here, we present three main modifications to this algorithm that affect how the obtained output is refined. The first modification aims to maximise the number of real terms in the list of candidates with a new approach for the stop-list application process. The second modification adapts the C-value calculation formula in order to consider single word terms. The third modification changes how the term candidates are grouped, exploiting a lemmatised version of the input corpus. Additionally, size of candidate’s context window is variable. We also show the necessary linguistic modifications to apply this algorithm to the recognition of term candidates in Spanish.
منابع مشابه
Algorithms and Methods for the Automatic Speech Recognition in Spanish Language using Syllables
This work examines the results of incorporating into Automatic Speech Recognition the syllable units for the Spanish language. Because of the boundaries between phonemes-like units its often difficult to elicit them; the use of these has not reached a good performance in Automatic Speech Recognition. In the course of the developing the experiments three approaches for the segmentation task were...
متن کاملAutomatic Face Recognition via Local Directional Patterns
Automatic facial recognition has many potential applications in different areas of humancomputer interaction. However, they are not yet fully realized due to the lack of an effectivefacial feature descriptor. In this paper, we present a new appearance based feature descriptor,the local directional pattern (LDP), to represent facial geometry and analyze its performance inrecognition. An LDP feat...
متن کاملEvaluation of the Parameters Involved in the Iris Recognition System
Biometric recognition is an automatic identification method which is based on unique features or characteristics possessed by human beings and Iris recognition has proved itself as one of the most reliable biometric methods available owing to the accuracy provided by its unique epigenetic patterns. The main steps in any iris recognition system are image acquisition, iris segmentation, iris norm...
متن کاملDimensionality Reduction and Improving the Performance of Automatic Modulation Classification using Genetic Programming (RESEARCH NOTE)
This paper shows how we can make advantage of using genetic programming in selection of suitable features for automatic modulation recognition. Automatic modulation recognition is one of the essential components of modern receivers. In this regard, selection of suitable features may significantly affect the performance of the process. Simulations were conducted with 5db and 10db SNRs. Test and ...
متن کاملAn Improved Automatic EEG Signal Segmentation Method based on Generalized Likelihood Ratio
It is often needed to label electroencephalogram (EEG) signals by segments of similar characteristics that are particularly meaningful to clinicians and for assessment by neurophysiologists. Within each segment, the signals are considered statistically stationary, usually with similar characteristics such as amplitude and/or frequency. In order to detect the segments boundaries of a signal, we ...
متن کامل